ASYNC Loop Constructs for Relaxed Synchronization
Abstract
Conventional iterative solvers for partial differential equations impose strict data dependencies between each solution point and its neighbors. When implemented in OpenMP, they execute a barrier synchronization in every iterative step to ensure that these data dependencies are strictly satisfied. We propose new parallel annotations to support an asynchronous computation model for iterative solvers. ASYNC DO annotates a loop whose iterations can be executed by multiple processors, like an OpenMP parallel DO loop in Fortran (or a parallel for loop in C), but it does not require barrier synchronization. ASYNC REDUCTION annotates a loop that performs parallel reduction operations but synchronizes the processors with a relaxed tree barrier instead of the conventional barrier. When a number of ASYNC DO and ASYNC REDUCTION loops are embedded in an iterative loop annotated by ASYNC REGION, the iterative solver allows each data point to be updated using neighbor values that may not be the most current, instead of forcing the processor to wait for the new values to arrive. We discuss how the compiler can transform an ASYNC REGION (with embedded ASYNC DO and ASYNC REDUCTION loops) into an OpenMP parallel section with relaxed synchronization. We present experimental results showing the benefit of using ASYNC loop constructs in 2D and 3D multigrid methods as well as in an SOR-preconditioned conjugate gradient linear system solver.
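The abstract does not show the syntax of the ASYNC annotations, so the following is only a rough sketch of the underlying idea in standard OpenMP, not the authors' constructs or their compiler transformation. It assumes a 1D Laplace relaxation as a stand-in problem. The `nowait` clause removes the barrier after each sweep, so a thread may read neighbor values that are slightly stale; atomic reads and writes keep those accesses well defined. The names `solve_relaxed` and `max_residual` are hypothetical, and the plain convergence check at the end of each round is a simple substitute for the relaxed tree barrier the abstract describes.

```c
#include <assert.h>

/* Hypothetical helper: largest violation of u[i] == (u[i-1] + u[i+1]) / 2
   over the interior points. */
static double max_residual(const double *u, int n)
{
    double worst = 0.0;
    for (int i = 1; i < n - 1; ++i) {
        double r = u[i] - 0.5 * (u[i - 1] + u[i + 1]);
        if (r < 0.0) r = -r;
        if (r > worst) worst = r;
    }
    return worst;
}

/* Relaxes u in place with fixed boundary values u[0] and u[n-1].
   Returns the number of rounds used, or -1 if max_rounds was reached. */
int solve_relaxed(double *u, int n, int sweeps_per_round,
                  int max_rounds, double tol)
{
    assert(n >= 3 && sweeps_per_round > 0);
    for (int round = 1; round <= max_rounds; ++round) {
        #pragma omp parallel
        {
            for (int k = 0; k < sweeps_per_round; ++k) {
                /* nowait: no barrier between sweeps, so threads may drift
                   apart and use neighbor values from an older sweep. */
                #pragma omp for schedule(static) nowait
                for (int i = 1; i < n - 1; ++i) {
                    double left, right, updated;
                    #pragma omp atomic read
                    left = u[i - 1];
                    #pragma omp atomic read
                    right = u[i + 1];
                    updated = 0.5 * (left + right);
                    #pragma omp atomic write
                    u[i] = updated;
                }
            }
        }   /* the implicit barrier here is the only full synchronization per round */
        if (max_residual(u, n) < tol) return round;
    }
    return -1;
}
```

Compiled without OpenMP support, the pragmas are ignored and the routine reduces to ordinary in-place sweeps. With OpenMP enabled, the number of rounds needed can vary from run to run because the interleaving of updates is not fixed.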
Similar resources
Fourteen Ways to Fool Your Synchronizer
Transferring data between mutually asynchronous clock domains requires safe synchronization. However, the exact nature of synchronization sometimes eludes designers, and as a result synchronization circuits get “optimized” to the point where they do no longer operate correctly. This paper reviews a number of such cases, analyzes the causes of the errors, and offers a correct synchronizer circui...
Proving acceptability properties of relaxed nondeterministic approximate programs
Approximate program transformations such as skipping tasks [29, 30], loop perforation [21, 22, 35], reduction sampling [38], multiple selectable implementations [3, 4, 16, 38], dynamic knobs [16], synchronization elimination [20, 32], approximate function memoization [11], and approximate data types [34] produce programs that can execute at a variety of points in an underlying performance versu...
An Analysis of Reshuffled Handshaking Expansions
We present a method for reasoning about the synchronization behavior of reshuffled handshaking expansions. The technique introduced converts the handshaking expansion into communicating hardware processes. We identify and discuss some of the limitations of the method. We show how the approach can be applied to analyze both the performance and the correctness of handshaking expansions.
Study and Refactoring of Android Asynchronous Programming
To avoid unresponsiveness, a core part of mobile development is asynchronous programming. Android provides several async constructs that developers can use. However, developers can still use the inappropriate async constructs, which result in memory leaks, lost results, and wasted energy. Fortunately, refactoring tools can eliminate these problems by transforming async code to use the appropria...
Relaxed Synchronization and Eager Scheduling in MapReduce
MapReduce has emerged as a commonly-used programming model for large-scale distributed environments. While the underlying programming model based on maps and reductions has been shown to be effective in specific domains, significant questions relating to performance and application scope remain unresolved. This paper targets key questions of performance through relaxed semantics of underlying m...